A Procedure for Multi-Class Discrimination and some Linguistic Applications

Authors

Vladimir Pericliev

Raúl E. Valdés-Pérez

Abstract

The paper describes a novel computational tool for multiple concept learning. Unlike previous approaches, whose major goal is prediction on unseen instances rather than the legibility of the output, our MPD (Maximally Parsimonious Discrimination) program emphasizes the conciseness and intelligibility of the resultant class descriptions, using three intuitive simplicity criteria to this end. We illustrate MPD with applications in componential analysis (in lexicology and phonology), language typology, and speech pathology.

1 Introduction

A common task of knowledge discovery is multiple concept learning, in which the profiles of multiple given classes (i.e. a typology) are inferred, such that every class is contrasted with every other class by feature values. Ideally, good profiles, besides making good predictions on future instances, should be concise, intelligible, and comprehensive (i.e. yielding all alternatives).

Previous approaches like ID3 (Quinlan, 1983) or C4.5 (Quinlan, 1993), which use variations on greedy search, i.e. localized best-next-step search (typically based on information-gain heuristics), have as their major goal prediction on unseen instances, and are therefore not explicitly concerned with the conciseness, intelligibility, and comprehensiveness of the output.

In contrast to virtually all previous approaches to multi-class discrimination, the MPD (Maximally Parsimonious Discrimination) program we describe here aims at the legibility of the resultant class profiles. To do so, it (1) uses a minimal number of features by carrying out a global optimization, rather than heuristic greedy search; (2) produces conjunctive, or nearly conjunctive, profiles for the sake of intelligibility; and (3) gives all alternative solutions. The first goal stems from the familiar requirement that classes be distinguished by jointly necessary and sufficient descriptions.
The second accords with the equally familiar thesis that conjunctive descriptions are more comprehensible (they are the norm for typological classification (Hempel, 1965), and they are more readily acquired by experimental subjects than disjunctive ones (Bruner et al., 1956)), and the third expresses the usefulness, for a diversity of reasons, of having all alternatives. Linguists would generally subscribe to all three requirements, hence the need for a computational tool with such a focus. [1]

[1] The profiling of multiple types is, in actual fact, a generic task of knowledge discovery, and the program we describe has found substantial applications in areas outside of linguistics, e.g. in criminology, audiology, and datasets from the UC Irvine repository. However, we shall not discuss these applications here.

In this paper, we briefly describe the MPD system (details may be found in Valdés-Pérez and Pericliev, 1997; submitted) and focus on some linguistic applications, including componential analysis of kinship terms, distinctive feature analysis in phonology, language typology, and discrimination of aphasic syndromes from coded texts in the CHILDES database. For further interesting application areas of similar algorithms, cf. Daelemans et al., 1996 and Tanaka, 1996.

2 Overview of the MPD program

The Maximally Parsimonious Discrimination program (MPD) is a general computational tool for inferring, given multiple classes (or a typology) with attendant instances of these classes, the profiles (= descriptions) of these classes such that every class is contrasted with all remaining classes on the basis of feature values. Below is a brief description of the program.

2.1 Expressing contrasts

The MPD program uses Boolean, nominal and numeric features to express contrasts, as follows:

• Two classes C1 and C2 are contrasted by a Boolean or nominal feature if the instances of C1 and the instances of C2 do not share a value.
• Two classes C1 and C2 are contrasted by a numeric feature if the ranges of the instances of C1 and of C2 do not overlap. [2]

MPD distinguishes two types of contrasts: (1) absolute contrasts, when all the classes can be cleanly distinguished, and (2) partial contrasts, when no absolute contrasts are possible between some pairs of classes, but absolute contrasts can nevertheless be achieved by deleting up to N per cent of the instances, where N is specified by the user.

When no successful (absolute) contrasts have been achieved, the program can also invent derived features, the key idea of which is to express interactions between the given primitive features. Currently we have implemented the invention of novel derived features by combining two primitive features (combining three or more primitive features is also possible, but has not so far been done owing to the likelihood of a combinatorial explosion):

• Two Boolean features P and Q are combined into a set of two-place functions, none of which is reducible to a one-place function or to the negation of another two-place function in the set. The resulting set consists of P-and-Q, P-or-Q, P-iff-Q, P-implies-Q, and Q-implies-P.
• Two nominal features M and N are combined into a single two-place nominal function MxN.
• Two numeric features X and Y are combined by forming their product and their quotient. [3]

Both primitive and derived features are treated analogously in deciding whether two classes are contrasted by a feature, since derived features are legitimate Boolean, nominal or numeric features. It will be observed that contrasts by a nominal or numeric feature may (but will not necessarily) introduce a slight degree of disjunctiveness, which is to a somewhat greater extent the case in contrasts accomplished by derived features.

Missing values do not present much of a problem, since they can be ignored, with no need either to estimate a value or to discard the remaining informative feature values of the instance.
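The contrast tests and the five derived Boolean features described above can be sketched in Python as follows. This is only a minimal illustration, not MPD's actual implementation: the function names, the encoding of missing values as `None`, and the rule that a class with no observed values for a feature yields no contrast are all assumptions made for this sketch.

```python
def observed(values):
    """Drop missing values, encoded here as None (an assumption)."""
    return [v for v in values if v is not None]


def contrasted_nominal(values_c1, values_c2):
    """Boolean/nominal feature: the two classes share no observed value."""
    s1, s2 = set(observed(values_c1)), set(observed(values_c2))
    # Assumption: if a class has no observed value, report no contrast.
    return bool(s1) and bool(s2) and s1.isdisjoint(s2)


def contrasted_numeric(values_c1, values_c2):
    """Numeric feature: the observed ranges of the two classes do not overlap."""
    x1, x2 = observed(values_c1), observed(values_c2)
    if not x1 or not x2:
        return False
    return max(x1) < min(x2) or max(x2) < min(x1)


# The five two-place Boolean combinations listed in the text.
DERIVED_BOOLEAN = {
    "P-and-Q": lambda p, q: p and q,
    "P-or-Q": lambda p, q: p or q,
    "P-iff-Q": lambda p, q: p == q,
    "P-implies-Q": lambda p, q: (not p) or q,
    "Q-implies-P": lambda p, q: (not q) or p,
}


def derive(name, instances):
    """Apply one derived feature to a list of (P, Q) value pairs."""
    combine = DERIVED_BOOLEAN[name]
    return [combine(p, q) for p, q in instances]


# Hypothetical toy data: neither P nor Q alone separates the classes,
# but the derived feature P-iff-Q does.
c1 = [(True, True), (False, False)]
c2 = [(True, False), (False, True)]
p_contrast = contrasted_nominal([p for p, _ in c1], [p for p, _ in c2])
iff_contrast = contrasted_nominal(derive("P-iff-Q", c1), derive("P-iff-Q", c2))
```

In this toy case `p_contrast` is false while `iff_contrast` is true, which is the kind of feature interaction that derived features are meant to capture. Partial contrasts (deleting up to N per cent of the instances) are not covered by the sketch.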
In the case of nominal features, missing values can be treated as just another legitimate feature value.

[2] Besides these atomic feature values we may also support (hierarchically) structured values, but this will be of no concern here.

[3] Analogously to the Bacon program's invention of theoretical terms (Langley et al., 1987).

2.2 The simplicity criteria

MPD uses three intuitive criteria to guarantee the uncovering of the most parsimonious discrimination among classes:

1. Minimize overall features. A set of classes may be demarcated using a number of overall feature sets of different cardinality; this criterion chooses those overall feature sets which have the smallest cardinality (i.e. are the shortest).
2. Minimize profiles. Given some overall feature set, one class may be demarcated, using only features from this set, by a number of profiles of different cardinality; this criterion chooses those profiles having the smallest cardinality.
3. Maximize coordination. This criterion maximizes the coherence between class profiles in one discrimination model, [4] in the case when alternative profiles remain even after the application of the two previous simplicity criteria. [5]

Due to space limitations, we cannot enter into the implementation details of these global optimization criteria, which are in fact the most expensive mechanism of MPD. Suffice it to say here that they are implemented in a uniform way (in all three cases by converting a logic formula, either CNF or something more complicated, into a DNF formula), and all can use both sound and unsound (but good) heuristics to deal successfully with the potentially explosive combinatorics inherent in the conversion to DNF.

2.3 An illustration

By way of (a simplified) illustration, let us consider the learning of the Bulgarian translational equivalents of the English verb feed on the basis of the case frames of the latter.
Assume the following features/values, corresponding to the verbal slots: (1) NP1 = {hum, beast, phys-obj}; (2) VTR (a binary feature denoting whether the verb is transitive or not); (3) NP2 (same values as NP1); (4) PP (a binary feature expressing the obligatory presence of a prepositional phrase).

An illustrative input to MPD is given in Table 1 (the sentences in the third column of the table are not part of the input and are given only for the sake of clarity, though they would of course normally serve to derive the instances by parsing). The output of the program is given in Table 2.

MPD needs to find 10 pairwise contrasts between the 5 classes (i.e. N-choose-2, calculable by the formula N(N-1)/2), and it has successfully discriminated all [...]

[4] In a "discrimination model" each class is described with a unique profile.

[5] By way of an abstract example, denote features by F1...Fn, and let Class 1 have the profiles (1) F1 F2 and (2) F1 F3, and Class 2 the profiles (1) F4 F2, (2) F4 F5, and (3) F4 F6. Combining all alternative profiles freely with one another, we should get 6 discrimination models. However, in Class 1 we have a choice between [F2 F3] (F1 must be used), and in Class 2 between [F2 F5 F6] (F4 must be used); this criterion, quite analogously to the previous two, will minimize this choice, selecting F2 in both cases, and hence yield the unique model Class 1: F1 F2, and Class 2: F4 F2.
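The pairwise-contrast count and the abstract example of the coordination criterion can be checked with a short Python sketch. It rests on assumptions: the class names C1...C5 are hypothetical placeholders, and "maximizing coordination" is read here as choosing the discrimination models that use the fewest distinct features overall, which is one plausible interpretation of the example rather than MPD's documented mechanism (the paper implements its criteria via conversion to DNF).

```python
from itertools import combinations, product

# Five classes require N(N-1)/2 = 10 pairwise contrasts.
classes = ["C1", "C2", "C3", "C4", "C5"]  # hypothetical class names
pairs = list(combinations(classes, 2))

# Alternative profiles from the abstract example in footnote 5.
profiles = {
    "Class 1": [{"F1", "F2"}, {"F1", "F3"}],
    "Class 2": [{"F4", "F2"}, {"F4", "F5"}, {"F4", "F6"}],
}


def all_models(profiles):
    """Every discrimination model: one profile chosen per class."""
    names = list(profiles)
    choices = product(*(profiles[name] for name in names))
    return [dict(zip(names, choice)) for choice in choices]


def feature_count(model):
    """Number of distinct features used across all profiles of a model."""
    return len(set().union(*model.values()))


def most_coordinated(profiles):
    """Models with the fewest distinct features (assumed reading of criterion 3)."""
    models = all_models(profiles)
    best = min(feature_count(m) for m in models)
    return [m for m in models if feature_count(m) == best]


models = all_models(profiles)
selected = most_coordinated(profiles)
```

Under this reading, the six freely combined models reduce to the single model in which both classes reuse F2, matching the outcome stated in footnote 5.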

Similar resources

Investigating the Possibilities of Reading Literary Texts in Light of a Sociolinguistic Perspective: Applications on the Case of Alice Walker’s Selected Short Stories

The present research tries to show how race, class, and gender, and intersectionality in general, have a decisive impact on Black American women, and how Alice Walker as a womanist, in her selected short stories, tries to show that Black women in the U.S. suffer two-fold acts of oppression and discrimination, i.e. male violence affects all women in social life, irrespective of age or so...

New Applications on Linguistic Mathematical Structures and Stability Analysis of Linguistic Fuzzy Models

In this paper, some algebraic structures for linguistic fuzzy models are defined for the first time. By defining a linguistic fuzzy norm, the stability of these systems can be considered. Two methods (norm-based and graphical) for the stability analysis of linguistic fuzzy systems are presented. A new simple method for calculations with linguistic fuzzy numbers is then defined. At the end tw...

Exploiting Associations between Class Labels in Multi-label Classification

Multi-label classification has many applications in text categorization, biology and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As there are often relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...

An integrated model of fuzzy multi-criteria decision making and stochastic programming for the evaluating and ranking of advanced manufacturing technologies

Investment appraisal in advanced manufacturing technologies (AMTs) has been receiving considerable attention over the past three decades. As stated in numerous studies, traditional engineering economic methods cannot adequately justify investments in AMTs. Thus, besides these methods, some other solutions have been proposed in this field. The methods applied in the evaluation of AMTs can be clas...

Arithmetic Aggregation Operators for Interval-valued Intuitionistic Linguistic Variables and Application to Multi-attribute Group Decision Making

The intuitionistic linguistic set (ILS) is an extension of the linguistic variable. To overcome the drawback of using a single real number to represent the membership degree and non-membership degree for ILS, the concept of the interval-valued intuitionistic linguistic set (IVILS) is introduced in this paper by representing the membership degree and non-membership degree with intervals. The oper...

Strict fixed points of Ćirić-generalized weak quasicontractive multi-valued mappings of integral type

Many authors, such as Amini-Harandi, Rezapour et al., and Kadelburg et al., have tried to find at least one fixed point for quasi-contractions when $\alpha \in [\frac{1}{2}, 1)$, but no clear answer exists right now, and many of them have either failed or changed to a lighter version. In this paper, we introduce some new strict fixed point results in the set of multi-valued Ćirić-gener...

Publication date: 1998
